Self-Organising Maps in Document Classification: A Comparison with Six Machine Learning Methods
Authors
Abstract
This paper focuses on the use of self-organising maps, also known as Kohonen maps, for classifying text documents. The aim is to classify documents effectively and automatically into separate classes based on their topics. Classification with the self-organising map was tested on three data sets, and the results were compared with those of six well-known baseline methods: k-means clustering, Ward’s clustering, k-nearest-neighbour searching, discriminant analysis, the Naïve Bayes classifier and the classification tree. The self-organising map yielded the highest accuracies among the unsupervised methods tested in classifying the Reuters news collection and the Spanish CLEF 2003 news collection, and accuracies comparable to those of some of the supervised methods on all three data sets.
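As a rough illustration of how a self-organising map can act as a document classifier, the sketch below trains a tiny map on toy term-count vectors, labels each map unit by majority vote of the training documents it attracts, and classifies a new document by its nearest labelled unit. This is not the authors' implementation: the function names, the 2x2 map size, the linear decay schedules, the neighbourhood floor of 0.5 and the toy data are all assumptions made for the example.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the unit whose weight vector is closest to x."""
    return min(weights,
               key=lambda u: sum((a - b) ** 2 for a, b in zip(weights[u], x)))


def train_som(data, rows, cols, epochs=60, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    sigma0 = max(rows, cols) / 2.0
    # Assumption: initialise each unit with a randomly chosen training vector.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            bmu = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def label_units(weights, data, labels):
    """Give each unit the majority class of the training vectors mapped to it."""
    votes = {}
    for x, y in zip(data, labels):
        unit_votes = votes.setdefault(best_matching_unit(weights, x), {})
        unit_votes[y] = unit_votes.get(y, 0) + 1
    return {u: max(v, key=v.get) for u, v in votes.items()}


def classify(weights, unit_labels, x):
    """Predict the class of x from the nearest unit that received a label."""
    labelled = {u: weights[u] for u in unit_labels}
    return unit_labels[best_matching_unit(labelled, x)]


# Toy term counts over a hypothetical vocabulary: [goal, match, stock, bank]
docs = [[5, 4, 0, 0], [4, 5, 1, 0], [6, 3, 0, 1], [4, 4, 0, 0],
        [0, 0, 5, 4], [1, 0, 4, 5], [0, 1, 6, 3], [0, 0, 4, 4]]
labels = ["sports"] * 4 + ["finance"] * 4

som = train_som(docs, rows=2, cols=2)
unit_labels = label_units(som, docs, labels)
print(classify(som, unit_labels, [5, 5, 0, 0]))
print(classify(som, unit_labels, [0, 0, 5, 5]))
```

The map itself is trained without labels; class information enters only in the unit-labelling step, which is one common way to turn an unsupervised map into a classifier.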
Similar resources
A Comparison of Support Vector Machines and Self-Organizing Maps for e-Mail Categorization
This paper reports on experiments in multi-class document categorization with support vector machines and self-organizing maps. A data set consisting of personal e-mail messages is used for the experiments. Two distinct document representation formalisms are employed to characterize these messages, namely a standard word-based approach and a character n-gram document representation. Based on th...
Learning Document Image Features With SqueezeNet Convolutional Neural Network
The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state of the art model for this task. However, there are two major drawbacks for these classifiers: the huge computational power demand for...
On Document Classification with Self-Organising Maps
This research deals with the use of self-organising maps for the classification of text documents. The aim was to classify documents into separate classes according to their topics. We therefore constructed self-organising maps that were effective for this task and tested them with German newspaper documents. We compared the results gained to those of k nearest neighbour searching and k-means clu...
Survey on Supervised Classification using Self Organising Maps
Image classification is an important topic in digital image processing, and it can be addressed with pattern recognition methods. This paper is a survey of Self Organising Maps used as a supervised algorithm for image classification. It is observed that SOM can be used as a supervised method and can offer several advantages: better predictions, easier interpretation and better stability. Keywor...
Analytic Comparison of Audio Feature Sets using Self-Organising Maps
A wealth of different feature sets for analysing music has been proposed and employed in several different Music Information Retrieval applications. In many cases, the feature sets are compared with each other based on benchmarks in supervised machine learning, such as automatic genre classification. While this approach makes features comparable for specific tasks, it doesn’t reveal much detail...
Journal title:
Volume, issue:
Pages: -
Publication date: 2011